INCOMPATIBILITY OF TRENDS IN MULTI-YEAR ESTIMATES
FROM THE AMERICAN COMMUNITY SURVEY 

By Tucker McElroy 

U.S. Census Bureau 

The American Community Survey (ACS) provides one-year (1y),
three-year (3y) and five-year (5y) multi-year estimates (MYEs) of
various demographic and economic variables for each "community,"
although the 1y and 3y may not be available for communities with
a small population. These survey estimates are not truly measuring
the same quantities, since they each cover different time spans.
Using some simplistic models, we demonstrate that comparing differ- 
ent period-length MYEs results in spurious conclusions about trend 
movements. A simple method utilizing weighted averages is presented 
that reduces the bias inherent in comparing trends of different MYEs. 
These weighted averages are nonparametric, require only a short span 
of data, and are designed to preserve polynomial characteristics of the 
time series that are relevant for trends. The basic method, which only 
requires polynomial algebra, is outlined and applied to ACS data. In 
some cases there is an improvement to comparability, although a fi- 
nal verdict must await additional ACS data. We draw the conclusion 
that MYE data is not comparable across different periods. 

1. Introduction. The American Community Survey (ACS) replaces the 
former Census Long Form, providing timely estimates available throughout 
the decade. The ACS sample size is comparable to that of the Census Long 
Form; variability in the sampling error component of the ACS is partially 
reduced through a rolling sample [Kish (1981)]. The rolling sample refers 
to the pooling of sample respondents over time — in some cases this may be 
viewed as an approximate temporal moving average of single period esti- 
mates. In particular, estimates from regions with at least 65,000 people are 
produced with a single year of data, whereas if the population is between
20,000 and 65,000, then three years of data are combined, and if the
population is less than 20,000, then five years of data are pooled. A somewhat
dated overview of the ACS can be found in Alexander (1998). More current 
details can be found in the Census Bureau (2006) and Torrieri (2007). 

In order to examine longer time series of ACS data, it is necessary to 
examine older estimates published for a small group of regions in the
Multi-Year Estimates Study (MYES), which is publicly available at
www.census.gov/acs/www/AdvMeth/Multi_Year_Estimates/online_data_year.html.

The MYES was a trial study for the ACS that produced one, three and 
five year estimates for counties included in the 1999-2001 demonstration 
period and their constituent geographies, using data from 1999 through 
2005. The Multi-Year Estimates (MYEs) are divided according to period
length [one-year (1y), three-year (3y) or five-year (5y)], the time
period, the county and the geographic type within the county (e.g., school
district). There are hundreds of variables available, which are broken into 
four categories: demographic, economic, social and housing. Most of the 
variables are totals, averages, medians or percentiles. 

Because some counties have a low population, it was deemed desirable 
by the U.S. Census Bureau to decrease sampling error for smaller geogra- 
phies and subpopulations by using a rolling sample; a discussion of issues 
associated with this methodology can be found in the National Academy of 
Sciences Panel on the Functionality and Usability of Data from the Ameri- 
can Community Survey [Citro and Kalton (2007)]. In essence, responses over 
a 3y or even a 5y span are gathered together into one database, and a statis- 
tic of interest is computed over the temporally enlarged sample. In many 
cases, this is approximately equal to computing a simple moving average of
1y estimates. This is known as a rolling sample; see Kish (1981, 1998) and
Alexander (2001) for a discussion. For larger counties, the 1y MYE would
be available as well. The question of whether each year should be equally
weighted was addressed in Bell (1998) and Breidt (2007); since all the re- 
sponses are pooled in the 3y and 5y cases, the U.S. Census Bureau judged 
that it would be impractical to use some alternative weighting scheme (such 
as weighting the most recent year of data more highly). Hence, the MYEs 
are formed from contributions over multiple years that are equally weighted. 
Although this approach is simple, one repercussion is that some lag (or time 
delay) is induced by the use of rolling samples (whereas an unequal weight- 
ing scheme can be devised such that time delay is reduced or eliminated for 
certain components of the time series). 

The time delay effect is easy to understand in the case that the data is
a simple polynomial, such as a line or a quadratic. In the former case, a
three-period average induces a time delay of exactly one time unit, whereas
the five-period average delays the line by two time units. For higher de- 
gree polynomials the delay is not exact, and yet visually there is a definite 
shift in the graph of one or two units. Assuming that trends in ACS MYEs 
are locally given by low-degree polynomials, this brief discussion illustrates 
the problem with comparing MYEs of different period lengths (and this is 
further expounded in Sections 2 and 3 below). In particular, making com- 
parisons across regions of MYEs of different period lengths will in general 
lead to false conclusions and spurious deductions, and therefore should be 
avoided. This paper assesses the extent of this problem through some ex- 
tremely simple models, and proposes a class of trend-preserving weighted 
averages that can be used to illustrate and identify the sorts of false con- 
clusions arising from such inter-period comparisons. The perspective of this 
author is that such cross-period MYE comparisons should not be made for 
reasons discussed in the subsequent sections. Although use of the proposed 
weighted averages in this paper may well, in some cases, reduce the quantity 
of spurious conclusions drawn from the data, it is acknowledged that they 
do not provide a full solution to the problem of incomparability. 
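
The time delay described above can be checked numerically. The following is a minimal sketch, not part of the ACS methodology: the helper name `trailing_average` and the example line and quadratic are illustrative assumptions. It applies trailing three- and five-period equal-weight averages to a line, and a three-period average to a quadratic.

```python
def trailing_average(series, k):
    """Equal-weight average of the current and previous k-1 values.

    Entry i of the result corresponds to time index i + k - 1 of `series`.
    """
    return [sum(series[i:i + k]) / k for i in range(len(series) - k + 1)]


# Hypothetical trend values at t = 0, 1, ..., 9.
line = [10.0 + 2.0 * t for t in range(10)]
quadratic = [float(t * t) for t in range(10)]

avg3 = trailing_average(line, 3)  # times t = 2..9
avg5 = trailing_average(line, 5)  # times t = 4..9

# For a line, the 3-period average at time t equals the line at t - 1,
# and the 5-period average at time t equals the line at t - 2.
delay3_exact = all(abs(avg3[i] - line[i + 1]) < 1e-12 for i in range(len(avg3)))
delay5_exact = all(abs(avg5[i] - line[i + 2]) < 1e-12 for i in range(len(avg5)))

# For t^2 the delay is no longer exact: the 3-period average at time t
# equals (t - 1)^2 plus a constant offset.
quad3 = trailing_average(quadratic, 3)
offsets = [quad3[i] - quadratic[i + 1] for i in range(len(quad3))]

print(delay3_exact, delay5_exact, offsets[0])
```

For the quadratic, the shifted curve differs from the average by a constant (2/3 in this example), which is one way to see why the delay is only approximate for higher-degree polynomials.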

In Section 2 we provide additional discussion of the construction of MYEs, 
explicating the practical factors militating against inter-period comparisons. 
Then in Section 3 we discuss a simple model for MYEs that focuses on the 
temporal aspects, while ignoring sampling error for simplicity. Using this 
formal approach, we can illustrate in a quantitative fashion the pitfalls that 
may occur from making cross-period MYE comparisons. In Section 4 we 
propose a system of weighted averages that preserve any local polynomial 
trends, ensuring that these trends for 1y, 3y and 5y are identical after
application of the weights. This is a general technique based on simple time
series analysis and polynomial algebra, and we apply it in the linear trend 
case to MYE data in Section 5, making use of the newly available ACS 
data extended by the trial period of the MYES. Through several exam- 
ples, we illustrate the dangers of making inappropriate comparisons, that 
is, cross-region comparisons involving MYEs of different period lengths. Fi- 
nally, Section 6 summarizes the results of the paper and the main difficulties 
in inter-period comparisons. 

2. Practical issues in making comparisons. Beyond the issues of time 
delay raised in the Introduction and further described below, there is a 
problem comparing MYEs of different period lengths due to the differences 
in how the estimates are constructed. A detailed discussion of these issues is 
beyond the scope of this paper [for more information the reader is referred 
to Fay (2007), Starsinic and Tersine (2007), and Tersine and Asiala (2007)], 
but here we briefly highlight some relevant points.

In the construction of MYEs a weighting method is used that is different
for 1y versus 3y and 5y. In the former case, baseweights are used that
are defined as the inverse of sampling probabilities, with some differences 
between Housing Units (HU) and Group Quarters (GQ). Next, there is a 
nonresponse adjustment followed by the application of controls to a set of in- 
dependent HU estimates derived from the U.S. Census Bureau's Population 
Estimates Program (GQs are handled with separate controls). For the 3y 
and 5y estimates, similar weighting and adjustments are made, but based on
data pooled over the whole three years and five years, respectively. Moreover,
housing unit controls are further modified by the so-called g-weighting
(a type of calibration) [see Fay (2005, 2006, 2007)], with the objective of re- 
ducing (sampling error) variances at the sub-county aggregation level. This 
process involves linking administrative records data with the ACS sampling 
frame [Starsinic and Tersine (2007)]. 

As a result of g-weighting, the 3y and 5y estimates are fundamentally 
different in their construction from the 1y. We also point out that, apart
from the g-weighting, there is also the issue of additional pooling in 3y and
5y prior to weighting and nonresponse adjustment; thus, a 5y estimate will
have effectively five times as many sample cases receiving weighting as the
1y estimate. Furthermore, the population controls will vary between MYEs,
since the vintage of the population estimates will correspond to the final
year in the particular MYE. So the 3y MYE for 2005, 2006 and 2007 is
controlled to the average population for those years at a 2007 population
vintage, whereas the 1y MYEs for the corresponding years 2005, 2006 and
2007 will each be based on the population vintage of its own year; this
further interferes with comparability. A related issue is inflation
adjustment for monetary variables, which is handled by controlling to dollars 
in the latest year of the period. 

These are fundamental incompatibilities; one may see that 1y, 3y and 5y
are really measuring different quantities. The weighted average methodology
of this paper (presented below) can address the issue of pooling in an
approximate fashion, but does not provide a resolution to the effects of
g-weighting, nonresponse adjustment and variable (population and monetary)
vintages. However, given that it is common in trend analysis of demographic 
and economic time series to compare data that have no common basis of mea- 
surement [e.g., consumption versus income is analyzed for co-integration in 
Engle and Granger (1987)], it is only vital to account for time delay shifts 
in the respective time series. Although such weighted MYEs are not strictly 
comparable, they can still be used as subjects in such a longitudinal or mul- 
tivariate analysis, just as similar situations are treated throughout the social 
sciences [see Granger (2004)].

3. Comparing MYEs. This section develops the issue of comparability
in a mathematical framework, so that we can obtain a quantitative view
of why inter-period comparisons are problematic. The MYEs are currently
available as an annual time series, and we use the notation $Y_t^{(k)}$ for
the ky MYE available at year t, where k = 1, 3, 5. We define the Simple
Moving Average (SMA) polynomial of order k by

$$\Theta^{(k)}(z) = \frac{1}{k}\,(1 + z + \cdots + z^{k-1}).$$

As usual, B denotes the backshift operator. Because of the method of
construction of the MYEs described in Section 1, we might think that
$Y_t^{(5)} = \Theta^{(5)}(B) Y_t^{(1)}$ and $Y_t^{(3)} = \Theta^{(3)}(B) Y_t^{(1)}$
are approximately true equations [such an assumption is used for certain
variance calculations in Citro and Kalton (2007)]. However, in our
experience this approximation is poor for many variables, and is fair for
only a few variables, typically those involving linear statistics such as
totals and averages. Therefore, we adopt the following error model for the
purpose of demonstrating issues of comparability of trends:

(1)    $Y_t^{(k)} = \Theta^{(k)}(B)\mu_t + \varepsilon_t^{(k)}$

for k = 1, 3, 5. Here $\mu_t$ is a common deterministic trend function, and
the errors $\varepsilon_t^{(k)}$ include sampling error, serially correlated
stochastic trend perturbations and "nonadditive error," that is, the error
attributed to assuming a moving average relationship to be valid. We will
not be concerned with the statistical properties of these errors, though
they are assumed to be identically distributed in t with mean zero. The
common trend $\mu_t$ is conceived of abstractly, and does not necessarily
have a fundamental interpretation in terms of the population trend.
Although other models could be considered [such as
$Y_t^{(k)} = \Theta^{(k)}(B)(\mu_t + \varepsilon_t^{(k)})$], (1) will be
sufficient for our illustrative purposes.

Now suppose that we have two time series of MYEs, denoted $Y_t^{(k)}$ (with
trend $\mu_t^Y$ and error process $\varepsilon_t^{(k)}$) and $Z_t^{(k)}$ (with
trend $\mu_t^Z$ and error process $\eta_t^{(k)}$). These MYEs may correspond
to two different geographical regions, and a practitioner may be interested
in comparing the trends $\mu_t^Y$ and $\mu_t^Z$, either at several time
points or perhaps at just one time $t_0$. Formally, we might consider the
following hypotheses, although many others are conceivable:

$$H_0: \mu^Y_{t_0} = \mu^Z_{t_0} \quad \text{versus} \quad H_a: \mu^Y_{t_0} \neq \mu^Z_{t_0}.$$

In this formulation, the values of the mean at time $t_0$ simply become
parameters, and it is the statistician's task to devise parameter estimates
that are accurate and precise. Since typically in applications it is
desirable to make trend comparisons in real-time, any estimators must be a
function of present and past data only, that is, $\hat\mu^Y_{t_0}$ and
$\hat\mu^Z_{t_0}$ are functions of the MYE series at times
$t_0, t_0 - 1, \ldots$. The simplest unbiased estimators are
$\hat\mu^Y_{t_0} = Y^{(1)}_{t_0}$ and $\hat\mu^Z_{t_0} = Z^{(1)}_{t_0}$, but
the 1y MYEs are not always available. Suppose that the first region (Y)
includes 1y, 3y and 5y period MYEs, but the second (Z) includes only 3y
and 5y.

Commonly, users of MYEs (despite official cautions to the contrary) will
take $\hat\mu^Y_{t_0} = Y^{(1)}_{t_0}$ and $\hat\mu^Z_{t_0} = Z^{(3)}_{t_0}$
[or even equal to $Z^{(5)}_{t_0}$], even though the latter is a biased
estimate [due to the phase delay of $\Theta^{(3)}(B)$; see below] of the
trend. We refer to this as the "inapt" comparison. Seeking to mitigate the
phase delay, we can put both trend estimates on an equal footing by taking
$\hat\mu^Y_{t_0} = Y^{(3)}_{t_0}$ and $\hat\mu^Z_{t_0} = Z^{(3)}_{t_0}$. Now
both trend estimates are biased, but at least they are biased in a similar
fashion; this will be called the "untimely" comparison. A "proper"
comparison is one in which both estimates are unbiased
for their respective trend values. Of course, even for a proper comparison 
Type I and II errors will occur due to statistical uncertainty, but at least 
the bias will be eliminated. 

One could test the hypothesis of equal trends via
$\hat\mu^Y_{t_0} - \hat\mu^Z_{t_0}$; this has the following expectation for
the inapt comparison:
$\mu^Y_{t_0} - (\mu^Z_{t_0} + \mu^Z_{t_0-1} + \mu^Z_{t_0-2})/3$, which need
not be zero under $H_0$. For the untimely comparison, the expectation
would be

$$\left((\mu^Y_{t_0} - \mu^Z_{t_0}) + (\mu^Y_{t_0-1} - \mu^Z_{t_0-1}) + (\mu^Y_{t_0-2} - \mu^Z_{t_0-2})\right)/3.$$

If the trends agree at times $t_0$, $t_0 - 1$ and $t_0 - 2$, this quantity
is zero; however, some bias is to be expected under $H_0$. In contrast, it
is clear from the definition of the proper comparison that the mean of
$\hat\mu^Y_{t_0} - \hat\mu^Z_{t_0}$ is zero under $H_0$.
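
The three expectations can be illustrated with a small numerical sketch under model (1) with the errors set to zero. The two linear trends, the comparison year and the helper name `sma_expectation` below are hypothetical choices made only for illustration; the trends are constructed to agree at $t_0$ (so the null hypothesis holds there) while having different slopes.

```python
def sma_expectation(trend, t0, k):
    """Mean of a ky MYE at time t0 under model (1) with zero-mean errors:
    the average of the trend over times t0 - k + 1, ..., t0."""
    return sum(trend(t0 - j) for j in range(k)) / k


t0 = 7  # hypothetical comparison year


def mu_y(t):
    # Region Y: slope 2, passing through 100 at t0.
    return 100.0 + 2.0 * (t - t0)


def mu_z(t):
    # Region Z: slope 1, also passing through 100 at t0.
    return 100.0 + 1.0 * (t - t0)


# Inapt: 1y MYE for Y against 3y MYE for Z.
inapt_bias = sma_expectation(mu_y, t0, 1) - sma_expectation(mu_z, t0, 3)
# Untimely: 3y MYE for both regions.
untimely_bias = sma_expectation(mu_y, t0, 3) - sma_expectation(mu_z, t0, 3)
# Proper: unbiased estimates of both trend values (here the 1y MYEs).
proper_bias = sma_expectation(mu_y, t0, 1) - sma_expectation(mu_z, t0, 1)

print(inapt_bias, untimely_bias, proper_bias)
```

In this example the inapt and untimely comparisons are biased in opposite directions (by one unit each), while the proper comparison has zero bias, even though the two trends coincide at $t_0$.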

From this discussion, we see that making inferences about trends based on 
a direct use (i.e., by looking just at the values rather than some more com- 
plicated statistics) of MYEs of different period lengths leads to bias even in 
the case that a highly idealized model holds true. The incidence of spurious 
conclusions (i.e., Type I errors) can be reduced by making proper compar- 
isons, and we explore this further in the following section. However, even 
proper comparisons have their limitations, and our attitude is that MYEs of 
different period length should not be compared; using a proper comparison 
provides an improvement, but false conclusions can still be obtained (not to 
speak of the practical issues raised in Section 2). 

We note that the incomparability of trends increases with the dispersion
of the errors $\varepsilon_t^{(k)}$; if these errors were zero, then the
rolling sample would be exactly a moving average, and a proper comparison
would enable full comparability of MYE trends. A crude assessment of the
size of these errors, relative to the trend, is given by the "Noise-Signal
Ratio" (NSR)

$$\frac{\varepsilon_t^{(k)}}{\Theta^{(k)}(B)\mu_t} = \frac{Y_t^{(k)}}{\Theta^{(k)}(B)\mu_t} - 1.$$

This is only well-defined when $\Theta^{(k)}(B)\mu_t$ is nonzero, and we
generally suppose that it is positive at all times. Since we do not know
$\mu_t$, we can substitute $Y_t^{(1)}$ when the 1y MYEs are available. Then
for k = 3, 5, we have $Y_t^{(k)} / \Theta^{(k)}(B) Y_t^{(1)} - 1$ as our
estimate of the NSR. For convenience, we will instead use logarithms of
noise and signal, which are approximated (by first-order Taylor series) by
the former expression:

$$\mathrm{NSR}_t^{(k)} = \log Y_t^{(k)} - \log \Theta^{(k)}(B) Y_t^{(1)}$$

for k = 3, 5. Computing this quantity at all available times t, we define a
compatibility measure by

$$C^{(k)} = \max_t \left|\mathrm{NSR}_t^{(k)}\right|.$$

If this measure is small, for example, $C^{(k)} = 0.01$, then the rolling
sample is well-approximated by a moving average, and the proper comparison
is more meaningful.

4. Trend-preserving weighted averages. In what follows, the function
of the model (1) is to illustrate the incomparability of MYEs of different
period length; we are not interested in fitting the model to actual MYEs
in order to pursue statistical inference. In this sense, the model only
serves a pedagogical purpose. Next, suppose that $\mu_t$ is given by a
polynomial of degree d in t. Is it possible to find sets of weighted
averages, or linear filters, such that when applied to each MYE the trends
will coincide? That is, if we view the underlying trend of the ky MYE as
$\Theta^{(k)}(B)\mu_t$, then we seek three filters $\Psi^{(k)}(B)$ such that
$\Psi^{(k)}(B)\Theta^{(k)}(B)\mu_t$ is the same for each k = 1, 3, 5;
or, in other words,

(2)    $\Psi^{(1)}(z) = \Psi^{(3)}(z)\Theta^{(3)}(z) = \Psi^{(5)}(z)\Theta^{(5)}(z).$

Since users are typically interested in comparisons utilizing the most current 
data available, it makes sense to formulate our problem with concurrent 
filters, that is, filters that only depend on present and past data. Therefore,
each filter is of the form

$$\Psi^{(k)}(z) = \sum_{j \ge 0} \psi_j^{(k)} z^j.$$

In practice, only a finite number of the coefficients $\psi_j^{(k)}$ are
nonzero. Now a filter $\Psi(z)$ will pass (i.e., leave invariant) a
polynomial of degree d if $\Psi(1) = 1$ and
$\frac{\partial^j}{\partial z^j}\Psi(z)\big|_{z=1} = 0$ for $1 \le j \le d$
[Brockwell and Davis (1991), page 39]. Now using (2) and the fact that
$\Theta^{(3)}(z)$ and $\Theta^{(5)}(z)$ share no common roots, it is easy to
see that

$$\Psi^{(1)}(z) = \Phi(z)\Theta^{(3)}(z)\Theta^{(5)}(z).$$

We are free to design the polynomial $\Phi(z)$ such that the
polynomial-passing constraints are satisfied; hence, $\Phi(z)$ must have
degree at least d. The following theorem describes how to construct this
polynomial.

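Theorem 1. The minimal length concurrent filters $\Psi^{(k)}$ that pass
degree d polynomials and satisfy (2) are given by

$$\Psi^{(5)}(z) = \Phi(z)\Theta^{(3)}(z),$$
$$\Psi^{(3)}(z) = \Phi(z)\Theta^{(5)}(z),$$
$$\Psi^{(1)}(z) = \Phi(z)\Theta^{(3)}(z)\Theta^{(5)}(z),$$

where the coefficients of $\Phi(z)$ are given by the first column of the
inverse of the matrix with entry jk (for $1 \le j, k \le d + 1$) given by

$$\frac{\partial^{j-1}}{\partial z^{j-1}}\left[z^{k-1}\Theta^{(3)}(z)\Theta^{(5)}(z)\right]\Big|_{z=1}.$$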



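PROOF. Let $\Theta(z) = \Theta^{(3)}(z)\Theta^{(5)}(z)$, with $\phi_k$ the
coefficients of $\Phi(z)$. Applying the polynomial-passing constraints
yields

$$1_{\{j=0\}} = \frac{\partial^j}{\partial z^j}\left[\Phi(z)\Theta(z)\right]\Big|_{z=1}
= \sum_{l=0}^{j} \binom{j}{l}\, \frac{\partial^l \Phi(z)}{\partial z^l}\, \frac{\partial^{j-l}\Theta(z)}{\partial z^{j-l}}\Big|_{z=1}$$

$$= \sum_{l=0}^{j} \binom{j}{l} \sum_{k \ge l} \phi_k\, \frac{k!}{(k-l)!}\, \frac{\partial^{j-l}\Theta(z)}{\partial z^{j-l}}\Big|_{z=1}
= \sum_{k=0}^{d} \phi_k\, \frac{\partial^j}{\partial z^j}\left[z^k \Theta(z)\right]\Big|_{z=1}$$

for $0 \le j \le d$. This is easily rewritten in matrix form, from which the
result follows. □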

Example (Linear trends). Supposing that the trend is linear and d = 1,
we have

$$\Psi^{(5)}(z) = (4 + z + z^2 - 3z^3)/3,$$
$$\Psi^{(3)}(z) = (4 + z + z^2 + z^3 + z^4 - 3z^5)/5,$$
$$\Psi^{(1)}(z) = (4 + 5z + 6z^2 + 3z^3 + 3z^4 - z^5 - 2z^6 - 3z^7)/15.$$

Example (Quadratic trends). Supposing that the trend is quadratic
and d = 2, we have

$$\Psi^{(5)}(z) = (26 - 11z + 3z^2 - 23z^3 + 14z^4)/9,$$
$$\Psi^{(3)}(z) = (26 - 11z + 3z^2 + 3z^3 + 3z^4 - 23z^5 + 14z^6)/15,$$
$$\Psi^{(1)}(z) = (26 + 15z + 18z^2 - 5z^3 + 9z^4 - 17z^5 - 6z^6 - 9z^7 + 14z^8)/45.$$

Theorem 1 has the following interpretation. If one wishes to make a proper 
comparison of MYEs (defined in Section 3) that preserves polynomials of 
order d, then the minimal length linear filters that accomplish this goal are 
given by Theorem 1. 
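
The construction in Theorem 1 can be sketched in a few lines of exact rational arithmetic. This is an illustrative implementation under the stated assumptions, not the author's code: the function names (`poly_mul`, `deriv_at_one`, `sma_poly`, `solve`, `trend_filters`) are hypothetical, and instead of inverting the matrix explicitly the sketch solves the equivalent linear system whose right-hand side is the first unit vector.

```python
from fractions import Fraction
from math import factorial


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def deriv_at_one(p, j):
    """j-th derivative of the polynomial p, evaluated at z = 1."""
    return sum((c * (factorial(i) // factorial(i - j))
                for i, c in enumerate(p) if i >= j), Fraction(0))


def sma_poly(k):
    """Coefficients of the SMA polynomial (1 + z + ... + z^(k-1)) / k."""
    return [Fraction(1, k)] * k


def solve(matrix, rhs):
    """Solve a small square linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [list(row) + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


def trend_filters(d):
    """Filters for the 1y, 3y and 5y MYEs that pass degree-d polynomials."""
    theta = poly_mul(sma_poly(3), sma_poly(5))
    # Row j, column k: j-th derivative of z^k * theta(z) at z = 1.
    matrix = [[deriv_at_one([Fraction(0)] * k + theta, j) for k in range(d + 1)]
              for j in range(d + 1)]
    rhs = [Fraction(1)] + [Fraction(0)] * d
    phi = solve(matrix, rhs)  # same as the first column of the inverse
    return {
        1: poly_mul(phi, theta),
        3: poly_mul(phi, sma_poly(5)),
        5: poly_mul(phi, sma_poly(3)),
    }


linear = trend_filters(1)
quadratic = trend_filters(2)
print([str(c) for c in linear[5]])
```

Calling `trend_filters(1)` and `trend_filters(2)` gives coefficient lists (constant term first) that can be compared with the linear and quadratic examples of this section.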

5. Illustrations on ACS data. We now provide three illustrations of the 
concepts discussed in this article. We focus on Median Household Income 
in Pima, AZ, Number of Divorced Males in Lake, IL, and Median Age in 
Hampden, MA. These three counties are included in the MYES and, therefore,
the data extends back to the year 2000. In particular, the following
MYEs are available: 2000 through 2007 for 1y, 2001 through 2005 and 2007
for 3y, and 2003 through 2005 for 5y. The year index here refers to the last
year that entered into the sample, and so is consistent with our notation for
$Y_t^{(k)}$. Current ACS estimates are now available for all geographical
regions, covering the 1y years 2006 and 2007, and the 3y MYE 2005-2007 has
just become available. Letting t range between 00 and 07 (referring to the
year), the available database is $Y_{00}^{(1)}, \ldots, Y_{07}^{(1)}$,
$Y_{01}^{(3)}, \ldots, Y_{05}^{(3)}$, $Y_{07}^{(3)}$,
$Y_{03}^{(5)}, \ldots, Y_{05}^{(5)}$. In order to apply our methods, we need
to impute (by forecasting) the 3y MYE $Y_{06}^{(3)}$ and the 5y MYEs
$Y_{06}^{(5)}$ and $Y_{07}^{(5)}$. (This is a provisional necessity, since
in the future full time series data for all counties will be published.)

The missing values are obtained by forecasting them utilizing a simple 
random walk model, which is feasible for these time series based on economic 
and demographic considerations (to actually fit a time series model to such 
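a short series is pointless):

$$\hat{Y}_{06}^{(3)} = \tfrac{1}{2}\left(Y_{05}^{(3)} + Y_{07}^{(3)}\right),$$
$$\hat{Y}_{06}^{(5)} = Y_{05}^{(5)} + \tfrac{1}{2}\left(Y_{05}^{(5)} - Y_{03}^{(5)}\right),$$
$$\hat{Y}_{07}^{(5)} = Y_{05}^{(5)} + \tfrac{2}{2}\left(Y_{05}^{(5)} - Y_{03}^{(5)}\right).$$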

The MYEs (with imputed values in bold) are given in Table 1. The final row 
of the table gives the various 2007 trend values estimated via the method of 
Section 4 [the data and calculations are given in McElroy (2009)]. Note that
$Y_{01}^{(3)}$ and $Y_{03}^{(5)}$ are not used in the calculation of these
trend estimates. Although the Income MYEs follow a linear growth pattern,
the Divorce MYEs fluctuate more in their slope component, whereas the Age
MYEs trend upward very slowly with little noise. Thus, we might say that
Income and Age exhibit linear trend lines, whereas Divorce is nonlinear; it
is important to consider different types of trend behavior in order to
evaluate this paper's method.
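
The imputed Income values in Table 1 can be reproduced from the published values. The sketch below assumes the reading of the forecast formulas that matches the tabulated numbers: the missing 2006 3y MYE is the midpoint of the 2005 and 2007 3y MYEs, and the 5y MYEs are extended from 2005 using the average annual change between 2003 and 2005. The function names are hypothetical.

```python
# Published Income MYEs from Table 1, keyed by two-digit year.
income_3y = {1: 35956, 2: 36780, 3: 37373, 4: 38739, 5: 40404, 7: 44386}
income_5y = {3: 37510, 4: 38608, 5: 40055}


def impute_3y_2006(y3):
    """Midpoint of the neighbouring 2005 and 2007 3y MYEs."""
    return 0.5 * (y3[5] + y3[7])


def impute_5y(y5, year):
    """Extend the 5y series from 2005 by the average annual change over 2003-2005."""
    annual_change = 0.5 * (y5[5] - y5[3])
    return y5[5] + (year - 5) * annual_change


y3_06 = impute_3y_2006(income_3y)
y5_06 = impute_5y(income_5y, 6)
y5_07 = impute_5y(income_5y, 7)
print(y3_06, y5_06, y5_07)
```

Up to rounding to whole dollars, these agree with the 2006 and 2007 Income entries of Table 1.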

As for the linear approximation to the rolling sample, we can compute
the NSR comparability measure for years 2002-2007 for k = 3, and 2004-2007
for k = 5 (by including the forecasted data). For Income, $C^{(3)} = 0.017$
and $C^{(5)} = 0.020$, indicating some incompatibility. For the Divorce
variable, $C^{(3)} = 0.008$ and $C^{(5)} = 0.042$, indicating a high amount
of incomparability (though most of this comes from the portion of the data
that is forecasted, and thus might be resolved when the real numbers are
published). Finally, the Age variable is highly compatible with
$C^{(3)} = 0.002$ and $C^{(5)} = 0.004$.
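
As a rough check, the compatibility measure can be computed directly from the Income columns of Table 1. The text does not state the base of the logarithm; the sketch below assumes base 10, since that choice appears to reproduce the Income values of 0.017 and 0.020 quoted above (natural logarithms would give values larger by a factor of about 2.3). The function names are hypothetical.

```python
from math import log10

# Income MYEs from Table 1 (imputed values included).
income_1y = [35223, 35615, 37638, 37818, 38800, 41521, 42984, 43546]  # 2000..2007
income_3y = {2: 36780, 3: 37373, 4: 38739, 5: 40404, 6: 42395, 7: 44386}
income_5y = {4: 38608, 5: 40055, 6: 41328, 7: 42600}


def nsr(y1, yk, k, t):
    """Log of the ky MYE minus log of the k-term moving average of 1y MYEs.

    Base-10 logarithms are an assumption, see the accompanying text.
    """
    moving_average = sum(y1[t - k + 1:t + 1]) / k
    return log10(yk[t]) - log10(moving_average)


def compatibility(y1, yk, k):
    """Largest absolute NSR over all years where the ky MYE is available."""
    return max(abs(nsr(y1, yk, k, t)) for t in yk)


c3 = compatibility(income_1y, income_3y, 3)
c5 = compatibility(income_1y, income_5y, 5)
print(round(c3, 3), round(c5, 3))
```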

Table 1
MYEs for Income, Divorce and Age. Estimates have been forecast extended for the years
06 and 07; imputed values are marked with an asterisk

        Income MYEs                Divorce MYEs               Age MYEs
Year    1y       3y       5y       1y       3y       5y       1y       3y       5y
00      35223                      14043                      36.40
01      35615    35956             14376    14429             37.30    36.80
02      37638    36780             17866    15504             37.00    36.80
03      37818    37373    37510    17398    16772    15473    37.10    37.00    36.70
04      38800    38739    38608    15632    17156    15903    37.20    37.10    36.90
05      41521    40404    40055    14591    15889    15945    37.40    37.30    37.20
06      42984    42395*   41328*   20941    17371*   16181*   37.40    37.35*   37.45*
07      43546    44386    42600*   21844    18852    16417*   37.60    37.40    37.70*
Trend   43570    45223    45320    19331    19217    16695    37.59    37.59    38.25

Now imagine having two replications of each variable for two separate
regions: county A with all period-length MYEs available, and county B with a
lower population such that only 3y and 5y MYEs are available. Starting with
the Divorce variable, an illustration of the time delay properties of MYEs is
provided in comparing 1y to one-year-ahead 3y MYEs; there is a fairly close
match up until the 2005 1y MYE and 2006 3y MYE. However, this latter
value is imputed, and the true value could easily have decreased from 2005;
instead the imputation increases merely because there is so much gain in the
2007 3y MYE. The 2007 "inapt" comparison discussed in Section 3 would
then compare 21,844 with 18,852 or 16,417; these are -13.7% and -24.8%
discrepancies. If we use weighted averages for comparing trends, the
discrepancies are reduced to -0.59% and -13.6% respectively (though given the
nonlinear nature of the trend, we expect the forecasts to be inappropriate,
and hence not as much emphasis should be placed on the 5y MYEs). In this
case the weighted average methodology helps to properly align the series.
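
The Divorce entries in the final row of Table 1, and the discrepancies quoted above, can be reproduced as follows. The sketch writes out the linear-trend weights of Section 4 explicitly (most recent year first) and applies them to the Divorce columns of Table 1, including the imputed values; the helper names `trend` and `pct` are hypothetical.

```python
# Linear-trend weights from Section 4, most recent year first.
W1 = [w / 15 for w in (4, 5, 6, 3, 3, -1, -2, -3)]
W3 = [w / 5 for w in (4, 1, 1, 1, 1, -3)]
W5 = [w / 3 for w in (4, 1, 1, -3)]

# Divorce MYEs from Table 1, oldest first, each series ending in 2007.
divorce_1y = [14043, 14376, 17866, 17398, 15632, 14591, 20941, 21844]
divorce_3y = [14429, 15504, 16772, 17156, 15889, 17371, 18852]
divorce_5y = [15473, 15903, 15945, 16181, 16417]


def trend(series, weights):
    """Weighted average of the most recent len(weights) values of the series."""
    recent_first = series[::-1]
    return sum(w * y for w, y in zip(weights, recent_first))


def pct(reference, other):
    """Percentage discrepancy of `other` relative to `reference`."""
    return 100.0 * (other - reference) / reference


t1 = trend(divorce_1y, W1)
t3 = trend(divorce_3y, W3)
t5 = trend(divorce_5y, W5)

inapt = (pct(divorce_1y[-1], divorce_3y[-1]), pct(divorce_1y[-1], divorce_5y[-1]))
weighted = (pct(t1, t3), pct(t1, t5))
print(round(t1), round(t3), round(t5))
print([round(x, 2) for x in inapt], [round(x, 2) for x in weighted])
```

The same two helper functions can be applied to the Income and Age columns to obtain the remaining entries of the Trend row.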

For the Income and Age time series data, which both exhibit linear
trends (with the former having much more variability), the weighted
average method can actually increase discrepancies. In the former case, the
discrepancies of 1.9% and -2.2% become 3.8% and 4.0%; but for Age the
discrepancies of -0.53% and 0.27% become 0% and 1.8% after using weighted
averages. The Age data is very stable, and here an inapt comparison
indicates no change. We have not analyzed these percentages statistically, as
this would require actual modeling of the time series. Nevertheless, a rough
idea about trend comparability can be deduced from the discussion here.

In summary, we see through these examples that the weighted average
methodology can either increase or decrease discrepancies, and seems to
work less well with 5y versus 3y MYEs (although this may also be an
artifact of two imputations in the 5y MYEs). Part of this increase in
discrepancy is due to the weighted averages increasing the overall
variance (even if they reduce the bias of direct comparisons, as discussed in
Section 3); if in (1) we make the crude assumption that the errors
$\varepsilon_t^{(k)}$ are i.i.d., then the linear weights inflate the
variance by a factor of 1.16 and 3 respectively for the 3y and 5y MYEs. For
the 1y MYE the variance is multiplied by 0.48, but of course this MYE has
the greatest variability since its sampling error component is largest. This
variance inflation can be corrected by imposing extra conditions on the
filter coefficients, but the result would be an even longer set of weights.
It can also be observed that the random walk model used for forecasting is
poorly suited to the Divorce data, since the change in direction from 2003
to 2004 in the 1y MYE is not reflected in the corresponding time-delayed 5y
MYEs of 2005-2006. A more definitive study would not rely on imputations,
and would be concerned with the qualitative aspects of trends produced by
weighted averages; such a study must wait at least five years due to the
current ACS publication schedule.
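
The variance factors quoted above follow from the sum of squared filter weights, which is the variance multiplier when a linear filter is applied to i.i.d. errors. A minimal sketch using the linear-trend weights of Section 4 (the names `weights` and `variance_factor` are hypothetical):

```python
from fractions import Fraction

# Linear-trend weights from Section 4 for the 1y, 3y and 5y MYEs.
weights = {
    1: [Fraction(w, 15) for w in (4, 5, 6, 3, 3, -1, -2, -3)],
    3: [Fraction(w, 5) for w in (4, 1, 1, 1, 1, -3)],
    5: [Fraction(w, 3) for w in (4, 1, 1, -3)],
}


def variance_factor(filter_weights):
    """Variance multiplier for a linear filter applied to i.i.d. errors."""
    return sum(w * w for w in filter_weights)


factors = {k: variance_factor(w) for k, w in weights.items()}
print({k: float(v) for k, v in factors.items()})
```

The exact values are 109/225 (about 0.48) for 1y, 29/25 = 1.16 for 3y and 3 for 5y.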

6. Conclusion. The aim of this paper is first to discuss the challenges
in comparing cross-period MYEs. Due to the way in which MYEs are
constructed, it is apparent that 1y, 3y and 5y MYEs are different time
series, and not just time-lagged or smoothed versions of some underlying
series; they are estimates of different fundamental quantities (see
Section 2). Nevertheless, this fact does not preclude a user from making
cross-period comparisons, any more than it would be forbidden to search for
common trends in economic or demographic data. Therefore, the second aim of
this paper is to quantitatively assess what sorts of mathematical and
statistical problems will arise in such comparisons (see Sections 3 and 4).
As a third aim, the weighted averages method can be used to reduce the bias
inherent in such cross-period comparisons [under certain quasi-linear
assumptions such as (1)]; even so, the statistical variation in MYEs is such
that sizeable discrepancies can still crop up, as demonstrated in Section 5.

In summary, the author wishes to echo the strong cautions against making 
cross-period comparisons issued by the U.S. Census Bureau [see Beaghen and 
Weidman (2008) and Citro and Kalton (2007)]. At this point the weighted 
average methodology mainly serves to identify fairly egregious types of false 
conclusions derived from such unwarranted comparisons, but perhaps it can 
also serve as a building block for future work on comparability and usability 
issues in the ACS.

SUPPLEMENTARY MATERIAL

Income, Divorce and Age Data with Trend Calculations 

(DOI: 10.1214/09-AOAS259SUPP; .zip). This file contains the Income, Di- 
vorce and Age data of Table 1 in Excel format. Also provided are the linear 
trend weighted averages along with compatibility measures NSR, encoded 
as Excel formulas.